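Contextual ranking models based on BERT are now well established across a wide range of passage and document ranking tasks. However, BERT-based ranking models are not sufficiently robust to adversarial inputs. In this paper, we argue that BERT rankers are not immune to adversarial attacks targeting retrieved documents. First, we propose algorithms that adversarially perturb both highly relevant and non-relevant documents using gradient-based optimization. The aim of our algorithms is to add a small number of tokens to a highly relevant or non-relevant document so as to cause a large rank demotion or promotion. Our experiments show that a small number of tokens can already cause a large change in a document's rank. Moreover, we find that BERT rankers rely heavily on the start (head) of a document for relevance prediction, which makes the initial part of a document more susceptible to adversarial attacks. More interestingly, we find a small set of recurring adversarial words that, when added to documents, successfully demote any relevant document or promote any non-relevant document, respectively. Finally, our adversarial tokens also show particular topic preferences within and across datasets, exposing potential biases from BERT pre-training or the downstream datasets.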
translated by 谷歌翻译
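Sequential recommendation requires the recommender to capture evolving behavior characteristics from logged user behavior data in order to make accurate recommendations. However, a user behavior sequence resembles a script with multiple ongoing threads intertwined. We find that only a small set of pivotal behaviors evolve into the user's future actions. As a result, the user's future behavior is hard to predict. We summarize this characteristic of each user's sequential behaviors as the behavior pathway. Different users have their own unique behavior pathways. Among existing sequential models, Transformers have shown great capacity for capturing global dependencies. However, these models mainly use the self-attention mechanism to produce a dense distribution over all previous behaviors, so the final prediction is overwhelmed by trivial behaviors that are not adjusted to each user. In this paper, we build the Recommender Transformer (RETR) with a novel pathway attention mechanism. RETR can dynamically plan the behavior pathway specific to each user and sparingly activate the network through this pathway, effectively capturing the evolving patterns that are useful for recommendation. The key design is a learned binary route that prevents the behavior pathway from being overwhelmed by trivial behaviors. We empirically verify the effectiveness of RETR on seven real-world datasets, where it yields state-of-the-art performance.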
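The contrast between dense self-attention and attention gated by a binary route can be illustrated with a minimal sketch. This is not RETR's pathway attention: the function name `routed_attention` and the hand-written 0/1 `route` list are hypothetical, and in the abstract the route is learned rather than supplied by hand.

```python
import math

def routed_attention(scores, route):
    """Softmax over attention scores, restricted to positions where route == 1.

    scores: one raw attention score per past behavior.
    route:  hypothetical 0/1 flags; 0 removes that behavior from the softmax.
    Returns weights that sum to 1 over the kept positions (0.0 elsewhere).
    """
    if len(scores) != len(route):
        raise ValueError("scores and route must have the same length")
    kept = [s for s, r in zip(scores, route) if r == 1]
    if not kept:
        raise ValueError("route must keep at least one behavior")
    m = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - m) if r == 1 else 0.0 for s, r in zip(scores, route)]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 0.1, 0.3, 1.5]
dense = routed_attention(scores, [1, 1, 1, 1])   # ordinary softmax attention
sparse = routed_attention(scores, [1, 0, 0, 1])  # trivial behaviors masked out
```

With the route applied, the two low-scoring behaviors receive exactly zero weight, so the kept behaviors share the full attention mass instead of competing with them.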
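Multi-modality data is ubiquitous in biology, especially now that we have entered the multi-omics era, in which the same biological object (a cell) can be measured from different aspects (omics) to provide more comprehensive insight into the cellular system. When dealing with such multi-omics data, the first step is to determine the correspondence between the different modalities. In other words, we need to match data from different spaces that correspond to the same object. This problem is particularly challenging in the single-cell multi-omics setting because such data are extremely high-dimensional. Second, matched single-cell multi-omics data are rare and hard to collect. Furthermore, because of the limitations of the experimental environment, the data are usually very noisy. To promote single-cell multi-omics research, we overcome the above challenges and propose a novel framework to align and integrate single-cell RNA-seq data and single-cell ATAC-seq data. Our approach efficiently maps these highly sparse and noisy data from different spaces onto a low-dimensional manifold in a unified space, which makes downstream alignment and integration straightforward. Compared with other state-of-the-art methods, our method performs better on both simulated and real single-cell data. The proposed method is helpful for single-cell multi-omics research, and the improvement in integration on the simulated data is significant.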
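Real-time perception of dynamic environments is vital for autonomous robots operating in crowded spaces. Although popular voxel-based mapping methods can efficiently represent 3D obstacles of arbitrarily complex shape, they can hardly distinguish static from dynamic obstacles, which limits obstacle avoidance performance. While plenty of learning-based dynamic obstacle detection algorithms exist in autonomous driving, a quadcopter's limited computational resources cannot run these methods in real time. To address these issues, we propose a real-time dynamic obstacle tracking and mapping system for quadcopter obstacle avoidance using an RGB-D camera. The proposed system first uses the depth image together with an occupancy voxel map to generate potential dynamic obstacle regions as proposals. Given the obstacle region proposals, a Kalman filter and our continuity filter are applied to track each dynamic obstacle. Finally, we propose an environment-aware trajectory prediction method based on a Markov chain, using the states of the tracked dynamic obstacles. We implemented the proposed system with our custom quadcopter and navigation planner. Simulation and physical experiments show that our method can successfully track and represent obstacles in dynamic environments and safely avoid them.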
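The Kalman filter tracking step mentioned above can be sketched for a single axis as follows. This is a generic constant-velocity filter, not the system's implementation: the state layout `[position, velocity]`, the simplified process noise `Q = q * I`, the noise values, and the function names are all assumptions made for illustration, and the continuity filter is not reproduced.

```python
def kalman_predict(x, P, dt, q):
    """Propagate state x = [position, velocity] and covariance P by dt seconds."""
    pos, vel = x
    x_pred = [pos + vel * dt, vel]
    # P' = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I (assumed form).
    p00, p01, p10, p11 = P[0][0], P[0][1], P[1][0], P[1][1]
    P_pred = [[p00 + dt * (p01 + p10) + dt * dt * p11 + q, p01 + dt * p11],
              [p10 + dt * p11, p11 + q]]
    return x_pred, P_pred

def kalman_update(x, P, z, r):
    """Correct the state with a position measurement z of variance r (H = [1, 0])."""
    y = z - x[0]                       # innovation
    s = P[0][0] + r                    # innovation variance
    k0, k1 = P[0][0] / s, P[1][0] / s  # Kalman gain
    x_new = [x[0] + k0 * y, x[1] + k1 * y]
    # P' = (I - K H) P
    P_new = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
             [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
    return x_new, P_new

# Noise-free position readings of an obstacle moving at 1 m/s, sampled every 0.1 s.
dt = 0.1
x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for k in range(1, 51):
    x, P = kalman_predict(x, P, dt, q=1e-3)
    x, P = kalman_update(x, P, z=dt * k, r=0.01)
# x now holds the estimated [position, velocity] after 5 seconds.
```

Only position is measured, yet the filter recovers the velocity as well, which is what makes the tracked state usable for predicting where an obstacle will be next.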
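Navigating dynamic environments requires a robot to generate collision-free trajectories and actively avoid moving obstacles. Most previous works design path planning algorithms on top of a single map representation, such as a geometric, occupancy, or ESDF map. Although these methods have been successful in static environments, the limitations of the map representation prevent them from reliably handling static and dynamic obstacles at the same time. To address this problem, this paper proposes a gradient-based B-spline trajectory optimization algorithm that uses the robot's onboard vision. Depth vision enables the robot to track dynamic objects and represent them geometrically on the basis of the voxel map. The proposed optimization first adopts a circle-based guide-point algorithm to approximate the costs and gradients for avoiding static obstacles. Then, with the vision-detected moving objects, our receding-horizon distance field is used simultaneously to prevent dynamic collisions. Finally, an iterative re-guide strategy is applied to generate the collision-free trajectory. Simulation and physical experiments show that our method runs in real time and navigates dynamic environments safely.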
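To make the B-spline trajectory representation concrete, here is a minimal sketch that evaluates a uniform cubic B-spline from 2D control points. It shows only the parameterization that such an optimizer would adjust; the cost terms, guide points, and distance field from the abstract are not reproduced, and the function names and the choice of a uniform cubic spline are assumptions.

```python
def cubic_bspline_point(p0, p1, p2, p3, u):
    """Evaluate one uniform cubic B-spline segment at u in [0, 1].

    p0..p3 are four consecutive control points given as (x, y) tuples.
    """
    # Uniform cubic B-spline basis functions; they sum to 1 for every u.
    b0 = (1 - u) ** 3 / 6.0
    b1 = (3 * u ** 3 - 6 * u ** 2 + 4) / 6.0
    b2 = (-3 * u ** 3 + 3 * u ** 2 + 3 * u + 1) / 6.0
    b3 = u ** 3 / 6.0
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))

def sample_trajectory(control_points, samples_per_segment=10):
    """Sample every segment of the spline defined by the control points."""
    points = []
    for i in range(len(control_points) - 3):
        segment = control_points[i:i + 4]
        for k in range(samples_per_segment):
            points.append(cubic_bspline_point(*segment, k / samples_per_segment))
    return points

controls = [(0.0, 0.0), (1.0, 2.0), (2.0, -1.0), (3.0, 1.0), (4.0, 0.0)]
trajectory = sample_trajectory(controls)
```

Each sampled point depends on only four neighboring control points, so moving one control point away from an obstacle changes the trajectory locally, which is a common reason for pairing B-splines with gradient-based optimization.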
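We present a novel meta-learning approach for 6D pose estimation of unknown objects. In contrast to "instance-level" pose estimation methods, our algorithm learns object representations in a category-agnostic way, which endows it with strong generalization capabilities across object categories. Specifically, we employ a meta-learning approach based on conditional neural processes to train an encoder that captures the texture and geometry of an object in a latent representation, based on very few RGB-D images and ground-truth keypoints. A simultaneously meta-trained decoder then uses the latent representation to predict the 6D pose of the object in new images. To evaluate our algorithm, we conduct experiments on new fully annotated synthetic datasets generated from Multiple Categories in Multiple Scenes (MCMS). The experimental results show that our model performs well on unseen objects with varied shapes and appearances.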
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Knowledge graphs (KGs) serve as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that have few known triples for training. In light of this, few-shot KG completion (FKGC), which combines the strengths of graph representation learning and few-shot learning, has been proposed to address the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KG and the methods used. Finally, we present applications of FKGC models to prediction tasks in different areas and share our thoughts on future research directions for FKGC.
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) that fully exploits the relationship between support and query features within a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. It is therefore difficult to deploy them on edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution for compressing GNNs, in which a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods do not consider fairness. As a consequence, the student model usually inherits and even exaggerates the bias of the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
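The teacher-student mimicry described above is commonly implemented as a divergence between temperature-softened output distributions. The following minimal sketch shows that standard distillation term only; it is not RELIANT and contains no fairness component, and the function names and the default temperature are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl

matched = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge. Because this term rewards copying the teacher exactly, any bias in the teacher's predictions is copied too, which is the inheritance problem the abstract describes.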